ABSTRACT
A nonlinear version of redundancy analysis is introduced. The
technique is called REDUNDALS. It is implemented within the
computer program for canonical correlation analysis called
CANALS. The REDUNDALS algorithm is of an alternating least squares
(ALS) type. The technique is defined as minimization of a squared
distance between criterion variables and weighted predictor
variables. With the help of optimal scaling, the variables are
transformed nonlinearly. [...] used data from a survey conducted
with members of the Dutch [...] independent of how much variance
is explained, while REDUNDALS explains as much variance as
possible in every criterion direction. Two tables provide
information about the parliamentary study, and a figure
illustrates the monotone transformations of the variables.
Abstract


A nonlinear version of redundancy analysis is
introduced. The technique is called REDUNDALS. It is
implemented within the computer program for canonical
correlation analysis called CANALS (Van der Burg & De Leeuw,
1983). The REDUNDALS algorithm is of an alternating least
squares (ALS) type. The technique is defined as minimization
of a squared distance between criterion variables and
weighted predictor variables. With the help of optimal
scaling the variables are transformed nonlinearly (cf. Young,
1981). An application of redundancy analysis is provided.

Key words: redundancy analysis, canonical correlation
analysis, optimal scaling, nonlinear transformation.
Nonlinear Redundancy Analysis


Introduction 


In many situations data are available from different sources.
Suppose the data are of the form objects x variables, and
let us suppose the data from one source correspond with a
subset of variables. In case two (sub)sets of variables are
available, a possible technique to relate the sets to each
other is canonical correlation analysis (CCA). This technique
is described in many multivariate analysis textbooks (e.g.
Tatsuoka, 1971, chap. 6; Gnanadesikan, 1977, chap. 3.3). In
CCA the two sets of variables are treated symmetrically. But
a symmetric treatment is not always natural. It also happens
that it is clear from the data which variables are predictors
and which ones are criteria. In such cases redundancy
analysis (RA) is a possible technique.

The name redundancy analysis originates from Van den
Wollenberg (1977). Although he was the first one to name the
technique, it actually dates back to an earlier period. De
Leeuw (1986) discusses the history of RA. We briefly
summarize it. Horst (1955), Rao (1962), Stewart & Love (1968)
and Glahn (1969) all propose the Redundancy Index. Rao (1964)
and Robert & Escoufier (1976) discuss techniques for
decomposing this Redundancy Index into uncorrelated components.
Fortier (1966) proposes 'simultaneous linear predictions',
which is equivalent to RA (cf. Ten Berge, 1985). Izenman
(1975) and Davies & Tso (1982) also treat RA, but under the
name Reduced Rank Regression. So far the discussion of De
Leeuw (1986). Johansson (1981) proposes several forms of RA,
which vary with orthogonality constraints, and DeSarbo (1981)
discusses a technique which is a mixture between CCA and RA.
Van de Geer (1984) places various types of RA in a larger
framework of k sets CCA. Israëls (1986) treats RA with
various normalizations and rotations. Meulman (1986, chap.
5.2.1) discusses a version of RA which can be shown to be a
generalization of Van den Wollenberg's RA. However, Meulman
uses a completely different approach, formulating RA in terms
of distances between objects or individuals. We will come
back to this later.

A nonlinear version of RA has been proposed by Israëls
(1984). His technique makes it possible to incorporate
qualitative variables by the use of 'dummies'. Also Meulman
(1986, chap. 5.2.1) discusses a nonlinear version of RA,
dealing with variables on an ordinal measurement level. In
this paper another version of nonlinear RA is proposed. A
larger choice of measurement levels is possible for each
variable than in the case of Israëls (1984).

As the algorithm for nonlinear redundancy analysis
shows many correspondences with the algorithm for nonlinear
CCA proposed by Van der Burg & De Leeuw (1983), the computer
program for nonlinear RA, called REDUNDALS, is embedded in
the canonical correlation analysis program, called CANALS.
Redundancy analysis


Suppose the data consist of observations for $n$ objects on $m$
variables, and assume that the $m$ variables can be divided
into $m_1$ criterion variables and $m_2$ predictors. In addition
assume that each variable is standardized, i.e. it has zero
mean and unit variance. Collect the criterion variables in
the matrix $H_1$ of dimensions $(n \times m_1)$ and the predictors
in $H_2$ $(n \times m_2)$. The Redundancy Index of Stewart & Love
(1968) is obtained by a multivariate multiple regression of
$h_i$, the columns of $H_1$ $(i = 1, \ldots, m_1)$, on $H_2$. Thus

(1)  minimize $\sum_{i=1}^{m_1} (h_i - H_2 b_i)'(h_i - H_2 b_i) / (n m_1)$ over $b_1, \ldots, b_{m_1}$,

where the vector $b_i$ ($m_2$ elements) consists of regression
weights. The squared distance or loss is divided by a factor
$n m_1$ for the sake of comparing the various techniques. The
matrix formulation of (1) is:

(2)  minimize $\mathrm{tr}\,(H_1 - H_2 B)'(H_1 - H_2 B) / (n m_1)$ over $B$.

This expression is minimized by

(3)  $B = (H_2'H_2)^{-1} H_2'H_1$,
provided that $H_2'H_2$ is of full rank. Substitution of (3) in
(2) gives the minimum:

(4)  $\mathrm{tr}\,(H_1'H_1 - H_1'H_2 (H_2'H_2)^{-1} H_2'H_1) / (n m_1)$.

Denoting $R_{11}$ for $H_1'H_1/n$, and $R_{12}$ and $R_{22}$ for
$H_1'H_2/n$ and $H_2'H_2/n$ respectively (so that
$R_{21} = R_{12}'$), and noting that $\mathrm{tr}\,R_{11} = m_1$
for standardized variables, expression (4) is equivalent to

(5)  $1 - \mathrm{tr}\,(R_{12} R_{22}^{-1} R_{21}) / m_1$.

The expression $\mathrm{tr}\,(R_{12} R_{22}^{-1} R_{21}) / m_1$ is
equal to the Redundancy Index of Stewart & Love (1968). Thus
minimizing (1) corresponds to computing the Redundancy Index.
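
The steps (1) to (5) can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the REDUNDALS program; the function names `standardize` and `redundancy_index` and the simulated data are assumptions made only for illustration. It computes the weights of (3) by least squares, the minimum loss (4), and the Redundancy Index, and the final line prints the sum of loss and index, which by (5) should be 1 up to rounding.

```python
import numpy as np

def standardize(H):
    """Center each column and scale it to unit variance (divisor n)."""
    H = np.asarray(H, dtype=float)
    H = H - H.mean(axis=0)
    return H / H.std(axis=0)

def redundancy_index(H1, H2):
    """Return (B, loss, index) for standardized H1 (n x m1) and H2 (n x m2)."""
    n, m1 = H1.shape
    # (3): B = (H2'H2)^{-1} H2'H1, obtained here by least squares.
    B, *_ = np.linalg.lstsq(H2, H1, rcond=None)
    resid = H1 - H2 @ B
    loss = np.trace(resid.T @ resid) / (n * m1)   # minimum (4)
    R12 = H1.T @ H2 / n
    R22 = H2.T @ H2 / n
    # tr(R12 R22^{-1} R21) / m1, the second term of (5)
    index = np.trace(R12 @ np.linalg.solve(R22, R12.T)) / m1
    return B, loss, index

# Simulated data (hypothetical): 200 objects, 4 predictors, 3 criteria.
rng = np.random.default_rng(0)
H2 = standardize(rng.normal(size=(200, 4)))
H1 = standardize(H2[:, :2] @ rng.normal(size=(2, 3))
                 + rng.normal(size=(200, 3)))

B, loss, index = redundancy_index(H1, H2)
print(loss + index)
```

Least squares is used instead of an explicit inverse purely as a numerical convenience; under the full rank assumption both give the same $B$.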

However, this is not the same as performing a redundancy
analysis in the sense of Van den Wollenberg (1977). He
searches for (normalized) weights that optimize the explained
variance between criterion variables and weighted predictors.
These weight vectors $v$ ($m_2$ elements) are eigenvectors of the
matrix $R_{22}^{-1} R_{21} R_{12}$. Denote the corresponding
eigenvalues by $\mu$. Then

(6)  $R_{22}^{-1} R_{21} R_{12} v = \mu v$ with $v'R_{22}v = 1$.

When all $v$'s are solved, the sum of eigenvalues equals the
Redundancy Index (cf. Israëls, 1984). In fact we can see Van
den Wollenberg's analysis as a specialization of our
minimization problem (2), namely the case in which there are
rank restrictions on matrix $B$, i.e. $B = VW'$ with $V$
$(m_2 \times r)$, $W$ $(m_1 \times r)$, $1 \le r \le \min(m_1, m_2)$,
and normalization constraints on $V$, i.e. $V'R_{22}V = I$.
Expression (2) is rewritten in terms of $V$ and $W$ as follows:

(7)  minimize $\mathrm{tr}\,(H_1 - H_2 V W')'(H_1 - H_2 V W') / (n m_1)$ over $V$ and $W$,

subject to the condition that $V'R_{22}V = I$.


Some computational work shows that the columns of $V$
correspond to the vectors $v$ discussed above. Note that Van
den Wollenberg has the choice of $r$, i.e. how many
eigenvectors $v$ will be computed. In our case automatically
all weights $B$ are solved for, as this is implicit in the way
(2) is formulated. Although (7) is more restrictive than (2),
we can argue that formulation (7) is the more general one, as
(7) can be solved for $r = m_1$ (assuming that $m_1 < m_2$), and for
lower values of $r$.
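
The correspondence between (6), (7) and the unrestricted problem (2) can be illustrated with a short NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `redundancy_analysis`, the Cholesky route to the eigenproblem, and the simulated data are all choices made here. Given $V$ with $V'R_{22}V = I$, the optimal $W$ in (7) is taken as $R_{12}V$, which follows from regressing $H_1$ on $H_2V$. Note that with the index scaled by $1/m_1$ as in (5), it is the sum of the eigenvalues divided by $m_1$ that matches the index in this sketch.

```python
import numpy as np

def standardize(H):
    """Center each column and scale it to unit variance (divisor n)."""
    H = np.asarray(H, dtype=float)
    H = H - H.mean(axis=0)
    return H / H.std(axis=0)

def redundancy_analysis(H1, H2, r):
    """Rank-r solution of (7): eigenvalues mu, V (m2 x r), W (m1 x r)."""
    n = H1.shape[0]
    R22 = H2.T @ H2 / n
    R21 = H2.T @ H1 / n
    # Turn (6) into a symmetric eigenproblem via R22 = L L'.
    L = np.linalg.cholesky(R22)
    M = np.linalg.solve(L, R21)              # L^{-1} R21
    mu, U = np.linalg.eigh(M @ M.T)          # ascending eigenvalues
    order = np.argsort(mu)[::-1]             # largest first
    mu, U = mu[order], U[:, order]
    V = np.linalg.solve(L.T, U[:, :r])       # v = L^{-T} u, so V'R22V = I
    W = R21.T @ V                            # W = R12 V
    return mu, V, W

# Simulated data (hypothetical): m1 = 3 criteria, m2 = 5 predictors.
rng = np.random.default_rng(1)
n, m1, m2 = 300, 3, 5
H2 = standardize(rng.normal(size=(n, m2)))
H1 = standardize(H2 @ rng.normal(size=(m2, m1))
                 + rng.normal(size=(n, m1)))

mu, V, W = redundancy_analysis(H1, H2, r=m1)
R22 = H2.T @ H2 / n
R12 = H1.T @ H2 / n
index = np.trace(R12 @ np.linalg.solve(R22, R12.T)) / m1
B = np.linalg.solve(H2.T @ H2, H2.T @ H1)    # unrestricted weights (3)

print(np.allclose(V.T @ R22 @ V, np.eye(m1)))   # normalization of V
print(np.isclose(mu.sum() / m1, index))          # eigenvalues vs. index
print(np.allclose(V @ W.T, B))                   # r = m1 reproduces B
```

Choosing a smaller `r` keeps only the leading columns of $V$ and $W$, which is the reduced rank case.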

Expression (7) also shows the relation between reduced
rank regression and redundancy analysis, as reduced rank
regression corresponds to (7) with small $r$ (cf. De Leeuw,
Mooijaart & Van der Leeden, 1985). To recognize other forms
of RA it is necessary to formulate expression (7) in a
different way. Define matrix $X$ $(n \times r)$ as $H_2V$. Then we get

(8)  minimize $\{\mathrm{tr}\,(X - H_2V)'(X - H_2V) + \mathrm{tr}\,(H_1 - XW')'(H_1 - XW')\} / (n m_1)$

over $X$, $V$ and $W$, subject to the conditions that
$X = H_2V$ and $R_{xx} = I$.

Matrix $R_{xx}$ is equal to $X'X/n$. Meulman (1986, chap. 5.2.1)
discusses the minimization of the loss as formulated in (8),
subject to the condition that only $R_{xx} = I$. Thus $X$ does not
have to be in the column space of $H_2$. De Leeuw & Bijleveld
(1987) deal with the same loss function, but they use the
condition $R_{xx} = \alpha^2 I$, where $\alpha$ is a parameter. They
show that different values of $\alpha$ correspond to various
multivariate techniques, e.g. $\alpha = 0$ boils down to principal
component analysis (PCA), and $\alpha \to \infty$ corresponds to
reduced rank regression.


Optimal scaling 


Nonlinear transformations can be implemented in redundancy
analysis in many ways. To do so Israëls (1984) employed dummies
for variables measured on a nominal measurement level.
Meulman (1986, chap. 5.2.1) uses monotone regression in her
version of nonlinear RA. Monotone regression is a form of
optimal scaling (cf. Young, 1981). This means that the
transformations (scaling parameters) minimize the loss, and
at the same time measurement restrictions are maintained. We
also use optimal scaling. The nonlinear transformations
treated in this article are nominal and ordinal (a definition
will follow). In addition, of course, linear or numerical
transformations are dealt with. 'Dummy transformations', as
employed by Israëls (1984), are not discussed; however, they
can always be obtained by simply coding variables as dummies
and, in addition, by treating these dummies numerically.
Another way to obtain these 'dummy transformations' is by
using copies of a variable within the corresponding set, and
by treating these copies as nominal. This gives a multiple
nominal (or dummy) transformation (cf. Gifi, 1981, chap.
5.2.7). Using copies instead of dummies has the advantage
that one may choose both the dimensionality of the
transformation and the measurement level of each copy
separately. More information about copies can be found in De
Leeuw (1984) and Van der Burg & De Leeuw (1987).
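
As a small illustration of coding a variable as dummies that are then treated numerically, the NumPy sketch below builds standardized indicator columns for one categorical variable. The function name `dummy_columns` and the option `drop_first` are hypothetical. Dropping one category is an assumption made here: the centered indicator columns of all categories sum to zero, so keeping them all would violate the full rank condition needed for (3).

```python
import numpy as np

def dummy_columns(h, drop_first=True):
    """Code a categorical variable as standardized indicator columns."""
    cats, codes = np.unique(np.asarray(h), return_inverse=True)
    G = np.eye(len(cats))[codes]     # n x k indicator matrix
    if drop_first:
        G = G[:, 1:]                 # assumption: drop one category
    G = G - G.mean(axis=0)           # zero mean
    return G / G.std(axis=0)         # unit variance, like other columns

# Hypothetical nominal variable with three categories.
h = np.array(["a", "b", "c", "a", "b", "c", "a", "a"])
G = dummy_columns(h)
print(G.shape)
```

The resulting columns can be appended to the predictor set and analysed as ordinary numerical variables.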

The nominal, ordinal and numerical transformations
employed in this article agree with the transformations used
by Van der Burg & De Leeuw (1983) in their version of
nonlinear CCA (CANALS). Together these three transformations
form the optimal scaling. Our definition of optimal scaling
corresponds to the definition of Young (1981). We mentioned
already that optimal scaling refers to the fact that
variables are optimally scaled in the sense of the model.
This means that the data matrices $H_1$ and $H_2$ are replaced by
parameter matrices $Q_1$ $(n \times m_1)$ and $Q_2$ $(n \times m_2)$
such that they optimize the model, i.e. minimize the original
loss, but at the same time satisfy the measurement restrictions.
The original loss was formulated in (2). If the parameter matrix
$Q_1$ is substituted for $H_1$ and $Q_2$ for $H_2$, this expression
can be rewritten as follows. Denote the set of possible
transformations for the $i$th variable, i.e. $i$th column of
$[H_1, H_2]$, by $C_i$ and use the notation $q_i$ for the $i$th
column of $[Q_1, Q_2]$. Nonlinear redundancy analysis or
REDUNDALS is

(9)  minimize $\mathrm{tr}\,(Q_1 - Q_2 B)'(Q_1 - Q_2 B) / (n m_1)$

over $Q_1$, $Q_2$ and $B$, subject to the condition that

$q_i \in C_i$ $(i = 1, \ldots, m)$.

The sets of possible transformations are determined by tie
and normalization restrictions for nominal variables, and, in
addition, by monotone constraints for ordinal variables or by
linear constraints for numerical variables (cf. De Leeuw,
1977). The tie restrictions imply that ties in the data
correspond to ties in the transformation. Normalization
restrictions result in standardized transformations (i.e.
zero mean and unit variance). The monotone transformations
discussed here correspond to the secondary approach of
Kruskal & Shepard (1974). Finally, linear transformations are
equal to the variables themselves, as standardization of the
columns of the data matrix was supposed. A more extensive
discussion of optimal scaling restrictions can be found in
Young, De Leeuw & Takane (1974) and Young (1981).
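
The three kinds of restrictions can be made concrete with a sketch of a single scaling step. This is a simplified illustration, not the REDUNDALS code: the names `pava` and `optimal_scale` are hypothetical, and `target` stands for whatever unrestricted update the model currently suggests for the variable, which is an assumption of this sketch. Ties are kept tied by working with category means, ordinal variables get a weighted monotone regression on those means, and every result is standardized.

```python
import numpy as np

def pava(y, w):
    """Weighted non-decreasing regression (pool adjacent violators)."""
    blocks = []                      # each block: [mean, weight, count]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while the order constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            b2, w2, c2 = blocks.pop()
            b1, w1, c1 = blocks.pop()
            blocks.append([(w1 * b1 + w2 * b2) / (w1 + w2),
                           w1 + w2, c1 + c2])
    return np.concatenate([np.full(c, b) for b, _, c in blocks])

def optimal_scale(h, target, level):
    """Fit `target` under the restrictions of `level`, then standardize.

    h      : observed 1-D variable (categories or numbers)
    target : hypothetical unrestricted update for this variable
    level  : "nominal", "ordinal" or "numerical"
    """
    h = np.asarray(h)
    target = np.asarray(target, dtype=float)
    if level == "numerical":
        q = h.astype(float)          # linear: the variable itself
    else:
        _, codes = np.unique(h, return_inverse=True)
        counts = np.bincount(codes)
        # Tie restriction: equal data values get the same quantification.
        means = np.bincount(codes, weights=target) / counts
        if level == "ordinal":
            means = pava(means, counts)   # monotone in category order
        q = means[codes]
    q = q - q.mean()
    return q / q.std()   # assumes q is not constant after the fit

h = np.array([1, 1, 2, 2, 3, 3, 4, 4])
target = np.array([0.1, 0.3, 1.0, 1.2, 0.8, 0.6, 2.0, 2.2])
q_nom = optimal_scale(h, target, "nominal")
q_ord = optimal_scale(h, target, "ordinal")
q_num = optimal_scale(h, target, "numerical")
print(np.round(q_ord, 3))
```

In this example the category means of `target` are 0.2, 1.1, 0.7 and 2.1, so the ordinal fit pools categories 2 and 3 while the nominal fit leaves their order reversed.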